The effect of constraints on information loss and risk for clustering and modification based graph anonymization methods
Authors
Abstract
In this paper we present a novel approach for anonymizing Online Social Network (OSN) graphs which can be used in conjunction with existing perturbation approaches such as clustering and modification. The main insight is that by imposing additional constraints on which nodes can be selected, we can reduce the information loss with respect to key structural metrics while maintaining an acceptable risk. We present and evaluate two constraints, 'local1' and 'local2', which select the most similar subgraphs within the same community while excluding some key structural nodes. To this end, we introduce a novel distance metric that is based on local subgraph characteristics and calibrated using an isomorphism matcher. Empirical testing is conducted with three real OSN datasets, six information loss measures, five adversary queries as risk measures, and different levels of k-anonymity. The results show that, overall, the methods with constraints give the best results for information loss and risk of disclosure.
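To make the notion of a k-anonymity level under an adversary query concrete, here is a minimal sketch. It is not the paper's method. It assumes the simplest possible query, in which the adversary knows only the degree of the target node, and it reports the size of the smallest group of nodes that share a degree. The function name `degree_anonymity_level` and the toy edge lists are hypothetical and chosen for illustration only.

```python
from collections import Counter


def degree_anonymity_level(edges):
    """Return the largest k for which the graph is k-anonymous
    under a degree query (assumption: the adversary knows only
    the target's degree). Nodes without edges are not counted,
    because the graph is given as an edge list."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    if not degree:
        return 0  # no nodes appear in the edge list
    # Nodes with the same degree are indistinguishable to this adversary,
    # so the smallest such group bounds the anonymity level.
    group_sizes = Counter(degree.values())
    return min(group_sizes.values())


# Hypothetical toy graphs: a 4-cycle, and the same cycle with a pendant node.
cycle = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "a")]
cycle_with_pendant = cycle + [("a", "e")]

print(degree_anonymity_level(cycle))
print(degree_anonymity_level(cycle_with_pendant))
```

In the 4-cycle every node has degree 2, so all four nodes fall into one group. Adding the pendant node gives two nodes a unique degree, which drops the level to 1. Clustering and modification methods perturb the graph to raise this level, at some cost in information loss.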
Similar resources
A Comparison of Clustering and Modification based Graph Anonymization Methods with Constraints
In this paper a comparison is performed on two of the key methods for graph anonymization and their behavior is evaluated when constraints are incorporated into the anonymization process. The two methods tested are node clustering and node modification and are applied to online social network (OSN) graph datasets. The constraints implement user defined utility requirements for the community str...
An Effective Method for Utility Preserving Social Network Graph Anonymization Based on Mathematical Modeling
In recent years, privacy concerns about social network graph data publishing have increased due to the widespread use of such data for research purposes. This paper addresses the problem of identity disclosure risk of a node assuming that the adversary identifies one of its immediate neighbors in the published data. The related anonymity level of a graph is formulated and a mathematical model is...
A Graph-Based Clustering Approach to Identify Cell Populations in Single-Cell RNA Sequencing Data
Introduction: The emergence of single-cell RNA-sequencing (scRNA-seq) technology has provided new information about the structure of cells, and provided data with very high resolution of the expression of different genes for each cell at a single time. One of the main uses of scRNA-seq is data clustering based on expressed genes, which sometimes leads to the detection of rare cell populations. ...
Novel Approaches for Privacy Preserving Data Mining in k-Anonymity Model
In privacy preserving data mining, anonymization based approaches have been used to preserve the privacy of an individual. Existing literature addresses various anonymization based approaches for preserving the sensitive private information of an individual. The k-anonymity model is one of the most widely used anonymization based approaches. However, the anonymization based approaches suffer from the ...
Journal title:
- CoRR
Volume abs/1401.0458 Issue
Pages -
Publication date 2014